Intro

Column

Selected songs: 15

Experts curated our playlist: 3

Different countries: 26

Classes: 3

Column

Our Research

Consider the following snippet of music:


What are your thoughts about this, did you like or dislike it? What made you dislike the song? Is there a particular instrument you like? Can you dissect the music? Is it a genre of your preference?

In short, what would you guess is the most important aspect of the music that made you decide whether you liked it or not? This question is at the heart of our research. We wanted to know why people value certain music as beautiful and other music not, and hopefully you do too! Click through the website to find out more!

Background information

Row

Musical sophistication and beauty

The majority of people would agree that perception of beauty, especially in music, is a highly subjective phenomenon. Or is it? In our research, we aim to explore the relationship between beauty assessment and musical sophistication.

Previous research on aesthetics and music mainly focuses on personality traits and how these influence perception. For instance, awe is one of the profound aesthetic experiences, often described as being touched, moved, fascinated and amazed. It was found that people who are more open to experience are more susceptible to awe-like states (Silvia et al., 2015). Interestingly, the study was conducted in 2 domains: both visual and auditory stimuli were used. Across both domains, openness to experience was the only factor predictive of a stronger experience of awe. One of the drawbacks of the methodology is that judgements were made by listening to only one song (‘Hoppípolla’ by Sigur Rós). Furthermore, although none of the participants understood the Icelandic language, the overall perceived ‘melodicity’, familiarity, etc. of the language could have affected the perception of the song (Jenkin, 2014). In the present study, we use only instrumental music to control for the language factor. In addition, we use 15 music snippets as stimuli rather than a single song.

Usually, listening to music is an aesthetic experience that requires activation of not only affective, but also cognitive and evaluative processes. Studies have found that music expertise modulates the cortical processing of different aspects of music perception (e.g. Atienza et al., 2002; Bosnyak et al., 2004). Cognitive researchers (Müller et al., 2010) compared aesthetic judgements between experts and laypersons by using event-related potential (ERP) measurements. They found that when exposed to the same stimuli, experts’ and laypersons’ ERP measures systematically differed. We believe that if there is a difference in aesthetic judgements between groups of experts and laypersons on the ‘brain’ level, then there should be an observable distinction on a more conscious level, too. As in Müller’s paper, the question of whether the piece is beautiful or not, as opposed to ‘do you like it’, used in the majority of previous papers (e.g. Brattico & Jacobsen, 2009), will be used to quantify beauty assessment. In this way, the question becomes more linguistically sound and precise.

Sophistication is not the only factor that could potentially influence beauty perception. Genre preference might also have a profound impact on whether the participant finds a piece beautiful or not (Istók et al., 2013). The modernist view of music aesthetics (Burke & Gridley, 1990) supports the idea of genre hierarchy. This theory states that complex music, such as jazz, is less popularly valued because of its high intellectual demand. Followers of the theory would argue that jazz is a genre that can be comprehended and appreciated only by musically sophisticated individuals. For this reason, our stimuli include a variety of genres, for instance jazz (Drama in Six Notes), bluegrass (Less is Moi) and electronic (syro u473t8+e [141.98]). In addition, we will check if certain genre preferences correspond with higher musical sophistication scores.

Inspired by the aforementioned literature and also personal experiences, we would like to see if musical sophistication influences the perception of beauty. We hypothesize that higher scores for musical sophistication will align with higher scores for beauty. Furthermore, we will analyze the potential correlation between these scores and genre preference.

Review: Are musicians particularly sensitive to music?

In the article “Are Musicians Particularly Sensitive to Beauty and Goodness” (Güsewell & Ruch, 2014) the degree and form of musical practice of participants is compared to responsiveness to artistic, natural and non-aesthetic beauty and goodness. This was examined using self-report and stimulus-based instruments. It was found that professional musicians had the highest scores in responsiveness to artistic beauty, experience seeking, and absorption compared to the other groups. The amateur musicians scored highest on overall responsiveness, responsiveness to non-aesthetic goodness and responsiveness to nature. This supports the hypothesis that there is a link between sensitivity to beauty, goodness and musical practice. From the data, researchers concluded that the responsiveness to beauty was related to the degree of involvement in musical practice. It was suggested that the opportunity to artistically express oneself was needed for a balanced responsiveness to the beauty profile. The groups that scored highest in responsiveness to beauty were believed to have more opportunities to express themselves (amateurs and soloists) or take part in musical activities where strong expressive and artistic involvement was needed (high-level orchestra musicians). However, the participants were not grouped based on the opportunity for expression through music. Thus, based on the results of soloists and amateur musicians it is not yet possible to conclude that personal interpretation of music increases responsiveness to beauty. In our study, we focus on the musical sophistication of a larger sample, including participants with both low and high scores. Such a sample might provide a more in-depth look at the relationship between musical experience and perception of beauty.

Exploring Beauty through different characteristics

Column

The perception of Beauty

No-one has an objective view of beauty, as beauty can be seen in anything and by anyone. This was the main notion that we wrestled with when setting up our research. Since we could not come up with a definitive answer on what stimulates people to perceive beauty, we decided to make our very own playlist, based on music that we ourselves found either interesting or different. We then let 3 experts analyse the songs in this playlist on their musical components, so that we could curate a diverse playlist containing many different kinds of musical elements. With such a diverse playlist, the aim was to classify people based on their preferences. The playlist consists of 15 songs and can be found under the Songs tab. Each song was played for 30 seconds, after which participants rated the song as beautiful or not with a yes/no answer.

To find out how people differ in their perceptions of beauty, we decided to collect additional data about their characteristics. The additional characteristics were nationality, gender, age, genre preferences and musical sophistication. Visualisations of those characteristics are found on the right.

The end goal of our research is to divide people into classes by their musical preferences, and then check whether there are any significant differences in characteristics between the classes.

Characteristics of our sample

In this section, we will analyse the graphs to determine the characteristics of our sample.

Before we begin with the analysis, we would like to mention that our data might be somewhat biased, as our participants are predominantly female and in their 20s.

Nationality
As you can see from the nationality graph, the largest group of our participants is from The Netherlands. Nevertheless, we are proud to mention that we managed to gather a very diverse group of participants. Coming from 26 different countries, our participants are from all over the world, including Asia, Africa, Europe and America. The majority of our participants are either European or Asian.

Genre Preference
To assess genre preferences, we asked participants to rank their three most preferred genres in a questionnaire. When looking at the genre distribution graph, we can see that the majority of our sample prefers listening to pop over other genres. This might be due to the fact that our participants are predominantly in their 20s. Tied next in line are classical and rock music, followed by soundtracks and alternative. Country, electronic dance, jazz, oldies, hip hop, soul and R&B are genres that are also relatively preferred by our sample. The remaining genres are less preferred, with bluegrass ranked last. This might be because bluegrass is an uncommon genre and thus unfamiliar to the participants.

(Genre preference using STOMPR)
We also conducted a STOMPR test to create subclasses for later analysis. The Short Test of Music Preference-Revised (STOMPR) is a test designed to assess music preferences that are related to personality variables, self-views and cognitive abilities. The test consists of 4 sub-scales used to measure music preferences:
1. Reflective & complex: classical, blues, folk and jazz
2. Intense & rebellious: alternative, rock and heavy metal
3. Upbeat & conventional: country, religious, pop and soundtracks
4. Energetic & rhythmic: electronic dance, hip-hop, soul and funk
We divided the top 3 genre preferences of each participant over these STOMPR groups. The height of each bar is thus the number of times the genres in that group were mentioned among the most preferred genres.
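The counting behind the STOMPR bar chart can be sketched as follows. This is a minimal illustration, not the code we actually ran: the genre-to-group mapping follows the four sub-scales listed above, while the function name `count_stomp_mentions`, the example answers, and the choice to skip genres outside the four groups are assumptions made for this sketch.

```python
from collections import Counter

# Genre-to-group mapping, taken from the four STOMPR sub-scales in the text.
STOMP_GROUPS = {
    "Reflective & complex": ["classical", "blues", "folk", "jazz"],
    "Intense & rebellious": ["alternative", "rock", "heavy metal"],
    "Upbeat & conventional": ["country", "religious", "pop", "soundtracks"],
    "Energetic & rhythmic": ["electronic dance", "hip-hop", "soul", "funk"],
}
GENRE_TO_GROUP = {
    genre: group for group, genres in STOMP_GROUPS.items() for genre in genres
}


def count_stomp_mentions(top3_per_participant):
    """Count how often each STOMPR group appears in the top-3 lists."""
    counts = Counter()
    for top3 in top3_per_participant:
        for genre in top3:
            group = GENRE_TO_GROUP.get(genre.lower())
            # Assumption: genres outside the four groups are simply skipped.
            if group is not None:
                counts[group] += 1
    return counts


# Hypothetical top-3 answers of three participants (not our real data).
example_answers = [
    ["pop", "rock", "classical"],
    ["pop", "soundtracks", "jazz"],
    ["hip-hop", "pop", "bluegrass"],
]
stomp_counts = count_stomp_mentions(example_answers)
```

Each entry of `stomp_counts` would correspond to the height of one bar in the graph.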

From the graph, in which the upbeat & conventional category is mentioned most, we can infer that the majority of our participants prefer country, religious, pop and soundtrack music over other genres. Second in line are classical, blues, folk and jazz. Alternative, rock and heavy metal are relatively liked, while electronic dance, hip-hop, soul and funk music are least preferred.

Gold-MSI
To test individual differences in musical sophistication, we used the Goldsmiths Musical Sophistication Index (Gold-MSI). The following 5 aspects are measured using a self-report questionnaire:
1. Active musical engagement: the amount of time and resources spent on music
2. Self-reported perceptual abilities: the accuracy of musical listening skills
3. Musical training: the amount of formal musical training received
4. Self-reported singing abilities: the accuracy of singing
5. Sophisticated emotional engagement with music: the ability to talk about the emotions that music expresses
The higher the score, the more musically sophisticated an individual is.

To see if musical sophistication affects genre preferences, we looked at the distribution of the Gold-MSI scores in the STOMP groups. From the graph, we can infer that all 4 categories have fairly similar median scores. The median score is highest in the energetic & rhythmic category. Taken at face value, this would suggest that a somewhat higher level of musical sophistication goes together with appreciating electronic dance, hip-hop, soul and funk music, and a slightly lower level with appreciating genres in the upbeat & conventional category. The latter is shown not only by a lower median score, but also by a much lower minimum score in comparison to the energetic & rhythmic category.
This difference is, however, less obvious for the 2 remaining categories. The median scores for the intense & rebellious and reflective & complex categories are not that different. According to the density of the plot, somewhat more people with a lower score appreciate genres in the intense & rebellious category, but this difference is small. The distribution and interquartile range hardly differ between these two categories. In fact, the differences across all 4 categories are small. Hence, from the graph, we conclude that musical sophistication does not have much effect on genre preferences.
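The comparison of medians described above can be sketched like this. The scores below are made up for illustration, and the assumption that each record pairs one STOMP category with one Gold-MSI score is ours for this sketch; it is not a description of how the graph was actually produced.

```python
from statistics import median


def median_by_group(records):
    """Return the median Gold-MSI score for every STOMP category."""
    scores_per_group = {}
    for category, score in records:
        scores_per_group.setdefault(category, []).append(score)
    return {cat: median(scores) for cat, scores in scores_per_group.items()}


# Hypothetical (category, Gold-MSI score) pairs, not our real data.
hypothetical_records = [
    ("Upbeat & conventional", 70),
    ("Upbeat & conventional", 82),
    ("Upbeat & conventional", 55),
    ("Energetic & rhythmic", 85),
    ("Energetic & rhythmic", 79),
]
medians = median_by_group(hypothetical_records)
```

Medians are less sensitive to a few extreme scores than means, which is one reason a boxplot-style comparison is a reasonable first look at group differences.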

Column

Nationalities

Genre preferences

STOMP

Musical sophistication and genre preference

Compiling the playlist

Column

Rating the Songs

When starting out our survey, we searched online for existing datasets of musical pieces that had been dissected into their musical components. This was important as we wanted to compare musical sophistication with musical pieces and we needed information about the structure of these pieces. Since this yielded no results, we each decided to supply 5 instrumental songs to a playlist on Spotify. These songs needed to be instrumental to control for the influence of language on the perception of beauty. After we had compiled 30 songs, we asked 3 musical experts, each with more than 10 years of formal training, to rate them on 9 components on a 10-point Likert scale, following the method used by Aljanaki et al. (2016). The components were:

  • Tempo: the general pulse of the song, ranging from very slow (1) to very fast (10)

  • Articulation: the rhythmic articulation of each song, ranging from very staccato (1) to completely legato (10); staccato notes are separate notes with rests in between, legato notes are strung together

  • Mode: overall mode and feel of the songs, ranging from minor (1) to major (10)

  • Intensity: overall loudness and crescendos and decrescendos in a song, ranging from 1 (pianissimo) to 10 (fortissimo)

  • Tonalness: overall tonalness of the composition, ranging from atonal (1), with no discernible mode or key, to tonal (10), with no use of “outside” extensions and a clearly discernible key and mode

  • Pitch: overall distribution of the pitches, ranging from all bass (1) to all treble (10)

  • Melody: overall presence and dominance of melody, ranging from very unmelodious (1) to very melodious (10)

  • Rhythmic Clarity: overall presence of a pulse, ranging from very vague (1) to very firm (10)

  • Rhythmic Complexity: the extent to which different meters, odd tempos or complex rhythmic patterns are utilized, ranging from very simple (1) to very complex (10)

After all songs were rated, we selected 15 songs to include on our survey based on A) Feature Representability and B) Reliability


A) Feature Representability
The panel on the right is interactive: hover over a point with your mouse to find out more.

The combined box and jitterplot shows the overall distribution of the characteristics of the selected songs. The boxplot represents the feature values of all 30 songs. The jitterplot shows the feature values of the 15 songs we selected for our survey.

Examining the jitterplot, it becomes apparent that our selection covers quite a large part of the scale, with a range of around 6 points for most components. Particular attention should be given to the component Pitch, for which the selection consists mostly of compositions with an average bass/treble balance, plus 1 song in the lower range.

Overall this looks to be an okay distribution of songs, given that the playlist was compiled by 6 different people with different preferences. For some components however, a more extreme rating would be preferred so we would’ve had more room to examine the eventual class differences.

Box and jitterplot of average expert rating per song

Column

Assessing reliability

To start our selection of 15 songs, we first estimated the reliability of the expert ratings per song. To do this we computed distance scores between each of the 3 experts. For example, each rater provided a rating of the component Tempo for a given song. The first rater gave it a 5, the second rater gave it a 6 and the third a 7. The distance could then be calculated by taking the distance between the first and the second rater (6 - 5 = 1), the distance between the second and the third rater (7 - 6 = 1) and the distance between the first and the third rater (7 - 5 = 2). We then summed the difference (1 + 1 + 2 = 4), which provided an estimate of rater consensus on the component tempo.

Subsequently, this was done for all components per song, and then all the reliability scores per component were summed to give an estimate of overall reliability. The table on the right shows these scores for all 30 songs.
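The reliability score described above can be written down compactly. The sketch below is an illustration rather than our original analysis code: the function names and the ratings of the hypothetical song are invented, and taking the absolute value of each pairwise difference is an assumption that generalises the "larger minus smaller" subtraction in the worked Tempo example.

```python
from itertools import combinations


def component_distance(ratings):
    """Sum of absolute differences between every pair of expert ratings."""
    return sum(abs(a - b) for a, b in combinations(ratings, 2))


def song_reliability(ratings_by_component):
    """Sum the pairwise distances over all rated components of one song.

    A lower score means the experts agreed more on that song.
    """
    return sum(component_distance(r) for r in ratings_by_component.values())


# Worked example from the text: Tempo rated 5, 6 and 7 by the three experts.
tempo_distance = component_distance([5, 6, 7])  # 1 + 2 + 1

# Hypothetical song with only three of the nine components, for brevity.
hypothetical_song = {
    "Tempo": [5, 6, 7],
    "Mode": [3, 3, 4],
    "Melody": [8, 8, 8],
}
reliability = song_reliability(hypothetical_song)
```

With three raters and a 10-point scale, the distance for a single component can be at most 18 (for example ratings of 1, 10 and 10), so perfect agreement gives 0 and larger values indicate more disagreement.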

As can be seen from the table, the reliability scores range between 26 and 64, with a lower score representing better consensus on that song. Based on these scores, we set a cut-off point for song selection (reliability score < 45) and used this to select our songs. Upon examining this initial selection, however, it became apparent that the distribution of tempo ratings was skewed towards higher tempos and that there were not enough atonal songs (low score on tonalness). To correct for this, we decided to swap in the song Šešių Natų Drama (reliability score of 46) for the song The Kiss (reliability score of 42), to make sure our songs represented most of the component ranges. In the next segment we will examine this further.

Reliability scores per song

Songs

Row

Snippets

Blueming

Bygone Bumps

Cia Pat

Decision (Price of Love)

Elysium

Firth Of Fifth

Less Is Moi

Married Life

Resolver

Scarface Theme

Single Petal Of A Rose

Song For A New Beginning

syro u473t8+e

Šešių Natų Drama / Drama In Six Notes

USA III Rail

Full songs

Contact us

Row

Contact information and Names

Hello, and welcome to our portfolio. We are students from the honours course The Data Science of Everyday Music Listening coordinated by dr. J.A. Burgoyne, and we wanted to know more about the beauty of music. In our brainstorm sessions we concluded that the experience of musical beauty differs from person to person. We wanted to know if someone’s musical sophistication influenced what songs they deemed beautiful. In this portfolio you will find the method and results of our research and we hope you will enjoy it. Sincerely, Willem Pleiter, Kristina Savickaja, Xiaoqing Li, Denise Quek, Nikita van ‘t Rood and Esther Liefting.

For further information, please contact us at

Splitting the Sample

Column

The LCA

Latent Class Analysis (LCA) is a psychometric method in which participants are grouped based on how likely they are to respond positively to a certain survey item, in our case whether a song snippet is beautiful or not. After running the results of our 119 participants through the LCA, it appeared that only a 3-class model fitted the data properly, so that was our choice.

An LCA table can be interpreted as follows. At the top of our table, the class names are given. We have chosen to name our classes the Likers, the Indifferents and the Dislikers. The Likers comprised 38% of our sample, the Indifferents 48% and the Dislikers 14%. Below these class names you see cells that each belong to a song. When you hover over a cell with your mouse, you see another percentage. For instance, the top left cell (Married Life, Likers) shows 100%, which means that a person belonging to the class of Likers has a 100% chance of liking this song. At the bottom is Syro, where there is only a 15% chance that someone in the Likers class will like that song.

These conditional probabilities are available for all classes, so you can hover over the table to see which songs were generally very liked and which were disliked. The idea behind the LCA is that you yourself will belong to one of these classes when filling out our survey, depending on how consistent your response pattern is with one of the classes. If you’ve listened to our samples, you might already have an idea to which class you could belong.
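How a response pattern maps onto a class can be illustrated with Bayes' rule: the class sizes act as prior probabilities and, under the usual LCA assumption that answers are independent within a class, the conditional probabilities are multiplied per song. The sketch below is only an illustration. The class sizes and the two Likers probabilities come from the text above; all other probabilities and the function name `class_posteriors` are hypothetical.

```python
def class_posteriors(responses, class_sizes, cond_probs):
    """Posterior probability of each class given yes/no answers per song."""
    joint = {}
    for cls, size in class_sizes.items():
        likelihood = size  # prior: share of the sample in this class
        for song, liked in responses.items():
            p_yes = cond_probs[cls][song]
            likelihood *= p_yes if liked else (1.0 - p_yes)
        joint[cls] = likelihood
    total = sum(joint.values())
    return {cls: value / total for cls, value in joint.items()}


# Class sizes as reported in the text.
class_sizes = {"Likers": 0.38, "Indifferents": 0.48, "Dislikers": 0.14}

# Probability of answering "beautiful" per class and song.
cond_probs = {
    "Likers": {"Married Life": 1.00, "Syro": 0.15},        # from the text
    "Indifferents": {"Married Life": 0.80, "Syro": 0.05},  # hypothetical
    "Dislikers": {"Married Life": 0.30, "Syro": 0.02},     # hypothetical
}

# A hypothetical visitor who liked Married Life but not Syro.
responses = {"Married Life": True, "Syro": False}
posteriors = class_posteriors(responses, class_sizes, cond_probs)
most_likely_class = max(posteriors, key=posteriors.get)
```

With only two songs the classes are hard to tell apart; using all 15 songs would give a much sharper assignment.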


ANOVA
On the second tab, the post-hoc results from our ANOVA are visible. Analysis of variance (ANOVA) is a method in which the means and variances of groups (in our case: classes) are compared to see whether the differences in Gold-MSI scores are significant (meaning that they are unlikely to be due to random variation or naturally occurring error alone). If an ANOVA is significant, then it means that at least one group differs from the others, but the ANOVA itself will not tell you which groups differ from each other. Assumptions of normality and equal variance were checked and were met for our test.
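To make the idea concrete, the F statistic of a one-way ANOVA compares the variance between group means with the variance within groups. The sketch below uses made-up numbers, not our Gold-MSI data, and the function name is our own; it computes only the F statistic and its degrees of freedom, not the p-value, which additionally requires the F distribution.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three classes (illustration only).
f_stat, df_between, df_within = one_way_anova_f(
    [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
)
```

A large F value means the group means differ more than would be expected from the spread inside the groups.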

To find out which groups differ from each other, so-called post-hoc tests can be used, which compare the groups pair by pair instead of all at once like ANOVA does. A conventional method for post-hoc analysis is Tukey's test, of which you see the output in the right panel. The plot can be interpreted as follows: the y-axis shows which comparison has been made (Indifferents with Likers, Likers with Dislikers, and so on), and the x-axis shows the size of the difference. In the plot itself, three intervals are visible. These are the 95% confidence intervals: if we were to repeat the experiment many times, about 95% of the intervals constructed this way would contain the “true” difference in Gold-MSI score. Since the intervals drawn in red do not contain 0, a difference of zero is not plausible for those comparisons, which allows us to reject the hypothesis that these groups do not differ in Gold-MSI scores.

Phrased more clearly, the plot shows evidence that the Likers tend to have higher Gold-MSI scores than both the Dislikers and the Indifferents, but there is no significant difference between the Dislikers and the Indifferents (since that confidence interval contains zero). This supports our hypothesis. In the next section, we’ll illustrate how big these differences are exactly and examine if any other characteristics can explain the difference found between groups.

Column

LCA class table

ANOVA Post-Hoc Results

Sample characteristics by class

Column

Class descriptions

Here we will interpret the graphs

Column

Visualisations

Here we will visualize some Gender, STOMP-scores, Gold-MSI and other types of class characteristics that are interesting in a tabset form

Conclusion and Discussion

Column

Conclusion

Here we will interpret the graphs

Column

Discussion

Here we will visualize some Gender, STOMP-scores, Gold-MSI and other types of class characteristics that are interesting in a tabset form